This document is also available in these non-normative formats: XML.
Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document describes internationalization usage patterns and scenarios for Web services and is intended for review by W3C members and other interested parties. This version provides additional guidance for implementers of Web service technologies, suggesting methods for dealing with general international interoperability issues in services and service descriptions. One goal of this document is to provide a template for Web service designers to implement international capabilities in their services.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this series of documents is maintained at the W3C.
This is an updated working draft describing Web Services internationalization usage scenarios for review by W3C members and other interested parties. It has been produced by the Web Services Internationalization Task Force of the W3C Internationalization Working Group, as part of the W3C Internationalization Activity.
Discussion of this document takes place on the public mailing list public-i18n-ws@w3.org. To contribute, please subscribe by sending mail to public-i18n-ws-request@w3.org with subscribe as the subject. The archive of this list can be read by the general public.
We invite contributions of additional Usage Scenarios and Use Cases to document aspects of Web Services internationalization that are not covered yet in this document. For contributions, please use a format similar to the one used in this document. Please send your contribution or comment to the www-i18n-comments@w3.org mailing list (public archive). Please use [Web Services] or [WSUS] in the subject.
At the time of publication, the Working Group believed there were no patent disclosures relevant to this specification. A current list of patent disclosures relevant to this specification may be found on the Working Group's patent disclosure page.
This document is work in progress and does not imply endorsement by, or the consensus of, either W3C, or members of the Web Services Task Force of the W3C Internationalization Working Group. This document still contains incomplete descriptions in various places.
This document is a draft version, and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of all technical reports can be found at http://www.w3.org/TR/.
1 Introduction
1.1 Scope
2 Framework
2.1 Overview
2.2 Natural Language and International Preferences
2.2.1 WSDL
2.2.2 SOAP
2.2.3 Faults and Errors
2.3 Language and Context Negotiation Patterns
2.3.1 Language Neutral
2.3.1.1 Example: GetTimeService returns the current time
2.3.1.2 Usage Pattern
2.3.1.3 WSDL
2.3.1.4 SOAP
2.3.1.5 Faults
2.3.2 Service Determined
2.3.2.1 Example
2.3.2.2 Implementation or Deployment Decision
2.3.2.3 Enumerated Set
2.3.2.4 Service Determined
2.3.2.5 Usage Pattern
2.3.2.6 WSDL
2.3.2.7 SOAP
2.3.2.8 Faults
2.3.2.9 S-013: Service Determined Language Preference Leads to Fault
2.3.2.9.1 Scenario Definition
2.3.2.9.2 Description
2.3.3 Client Influenced
2.3.3.1 Example
2.3.3.2 Usage Pattern
2.3.3.3 WSDL
2.3.3.4 SOAP
2.3.3.5 Faults
2.4 Passing or Matching International Preferences
2.5 Natural Language Handling in Faults
2.5.1 S-005: Language Matching for Fault Reason Messages
2.5.1.1 Scenario Definition
2.5.1.2 Description
2.5.2 S-009: Language Preference for One Way Messages
2.5.2.1 Scenario Definition
2.5.2.2 Description
2.6 Ordering, Grouping, and Collation in Services
2.7 Descriptive Text in Service Descriptions
3 Usage Scenarios
3.1 Data Integrity and Interoperability
3.1.1 S-001: Data Integrity using an Unicode-based Encoding
3.1.1.1 Scenario Definition
3.1.1.2 Description
3.1.2 S-002: Data Integrity using a Legacy Encoding
3.1.2.1 Scenario Definition
3.1.2.2 Description
3.2 Language (Locale) Negotiation for SOAP Fault Messages
3.2.1 S-007: Language Preference for Chained Services
3.2.1.1 Scenario Definition
3.2.1.2 Description
3.2.2 S-008: Locale Sensitive Formated Data in SOAP Fault Messages
3.2.2.1 Scenario Definition
3.2.2.2 Description
3.2.3 S-010: Language Preference for Multiple SOAP Bindings
3.2.3.1 Scenario Definition
3.2.3.2 Description
3.2.4 S-011: HeaderFault vs. Fault
3.2.4.1 Scenario Definition
3.2.4.2 Description
3.2.5 S-012: Interaction of Language Negotiation and Caching
3.2.5.1 Scenario Definition
3.2.5.2 Description
3.3 Locale Neutral vs. Locale Sensitive Data Exchange
3.3.1 Background
3.3.2 S-016: Locale Neutral Formated Data
3.3.2.1 Scenario Definition
3.3.2.2 Description
3.3.3 S-017: Data with Additional 'Attributes'
3.3.3.1 Scenario Definition
3.3.3.2 Description
3.3.4 S-018: Data with Default 'Attribute'
3.3.4.1 Scenario Definition
3.3.4.2 Description
3.3.5 S-019: Locale Dependent Datatypes
3.3.5.1 Scenario Definition
3.3.5.2 Description
3.3.6 S-020: Correlation of Data among Services in Different Languages
3.3.6.1 Scenario Definition
3.3.6.2 Description
3.4 Locale Sensitive Presentation
3.4.1 S-021: Data Formatting for End User on Receiver Side
3.4.1.1 Scenario Definition
3.4.1.2 Description
3.4.2 S-022: Data Formatting on Sender Side
3.4.2.1 Scenario Definition
3.4.2.2 Description
3.4.3 S-023: Data Formatting on Receiver Side according to Sender
3.4.3.1 Scenario Definition
3.4.3.2 Description
3.5 Locale Sensitive Data Processing
3.5.1 S-024: Locale Sensitive Processing by Provider(Receiver)
3.5.1.1 Scenario Definition
3.5.1.2 Description
3.5.1.2.1 Transport Layer (HTTP, SMTP/MIME, etc.)
3.5.1.2.2 Service Provider Layer
3.5.1.2.3 Service Layer
3.5.2 S-025: Opaque Identifier to Identify a Locale
3.5.2.1 Scenario Definition
3.5.2.2 Description
3.6 Finding Services
3.6.1 S-026: Searching for Web Services
3.6.1.1 Scenario Definition
3.6.1.2 Description
3.6.2 S-027: Fall-Back for Internationalized Web Services
3.6.2.1 Scenario Definition
3.6.2.2 Description
3.7 Services for Internationalization Functionality
3.7.1 S-028: Outsourcing Locale-Related Services
3.7.1.1 Scenario Definition
3.7.1.2 Description
3.7.2 S-029: Propagating Updates related to Locales
3.7.2.1 Scenario Definition
3.7.2.2 Description
3.8 Development of Internationalized Web Services
3.8.1 S-030: Internationalizing an Existing Web Service
3.8.1.1 Scenario Definition
3.8.1.2 Description
3.8.2 S-031: Communicating Available Options
3.8.2.1 Scenario Definition
3.8.2.2 Description
3.9 Template
3.9.1 Title
3.9.1.1 Scenario Definition
3.9.1.2 Description
4 Use Case
A References (Non-Normative)
B Acknowledgements (Non-Normative)
C Heisei (Non-Normative)
This document describes a variety of Web Services internationalization usage scenarios and use cases.
The goal of the Web Services Internationalization Task Force is to ensure that Web Services have robust support for global use, including all of the world's languages and cultures.
The goal of this document is to examine the different ways that language, culture, and related issues interact with Web Services architecture and technology. Ultimately this will allow us to develop standards and best practices for those interested in implementing internationalized Web Services. We may also discover latent international considerations in the various Web Services standards and propose solutions to the responsible groups working in these areas.
Web Services provide a world-wide distributed environment that uses XML based messaging for access to distributed objects, application integration, data/information exchange, presentation aggregation, and other rich machine-to-machine interaction. The global reach of the Internet requires support for the international creation, publication and discovery of Web Services. Although the technologies and protocols used in Web Services (such as HTTP [RFC2616], XML [XML], XML Schema, and so forth) are generally quite mature as "international-ready" technologies, Web Services may require additional perspective in order to provide the best internationalized performance, because they represent a way of accessing distributed logic via a URI.
As a result, this document attempts to describe the different scenarios in which international use of Web Services may require care on the part of the implementer or user or to demonstrate potential issues with Web Services technology.
This document describes the following scenarios:
Locale neutral vs. locale-sensitive XML messages and data exchange
Interaction between Web services and the underlying software system's international functionality
Message processing in Web Services, for example SOAP Fault messages
The scope of this document is described in Section 1.1 below.
This document follows the definition of Web Services specified in Chapter 2 of the Web Services Architecture document [WSA].
Definition: A Web service is a software system identified by a URI, whose public interfaces and bindings are defined and described using XML. Its definition can be discovered by other software systems. These systems may then interact with the Web service in a manner prescribed by its definition, using XML based messages conveyed by Internet protocols.
In order to narrow down the scope, the usage scenarios in this document are limited to the following W3C technologies and deliverables:
Web Services Architecture Documents [WSA] [WSAUS] [WSAR] [WSAG]
SOAP V1.2 Documents [SOAP-0] [SOAP-1] [SOAP-2] [SOAP-AF] [SOAP-EB]
WSDL V1.2 Documents [WSDL-V12] [WSDL-B]
Internationalized Web Services need to be easy to create, publish and discover for a wide range of audiences.
The scope of Web Services internationalization covers the deliverables of the W3C Internationalization Working Group, including:
Web services requirements
Unicode technologies and deliverables
Concepts and application of distributed locales and locale-affected preferences
Web services internationalization best practices
This section contains a "framework" or outline for understanding international issues in Web services.
The framework is based on the Web Services Architecture document [WSA], which defines a service as follows: "A service is a set of actions that form a coherent whole from the point of view of service providers and service requesters. A requester entity is a legal entity that wishes to make use of a provider entity's Web service. It will use a requester agent to exchange messages with the provider entity's provider agent. The provider agent has one or more services available to it, that it can invoke in behalf of the requester agent."
The mechanics of the message exchange are (partially) documented in a Web service description (WSD).
The sequence of events in this framework is as follows:
The requester agent locates a suitable provider agent. This can be accomplished through UDDI, but also can be accomplished through other means. For example, the URL of the provider agent may simply have been found in an advertisement somewhere.
The provider agent makes available a Web Services Description (WSD) document.
Using the information in the WSD, the requester agent can formulate a service request. This will be a SOAP message which is then sent to the provider agent to be acted upon.
After receiving the request, the provider agent will invoke the service and get a response. The response can be the results of the service or an indication that a fault occurred. Note that the interaction between the provider agent and the service is independent of the Web Services framework, and its design is left completely to the implementors. The primary requirement is that the provider agent in turn be able to formulate a response to return to the requester agent. This response must satisfy both the requirements and specifications of the Web Services Architecture and the description in the WSD.
If the service request was successfully executed, the provider agent will formulate a response message and send it to the requester agent.
If the service request was erroneous, or the service could not be completed for some reason, a fault message will be sent to the requester agent.
The internationalization issues in Web services, as illustrated in the framework, fall into several categories that are common to all Web Services, regardless of the message exchange pattern used for a specific service. In the sections that follow, it is assumed that the service, provider and requester agents, and data structures (semantics) follow best practices in internationalization and data structuring. Implicit in these descriptions is the expectation that data structures use XML Schema types to create locale-neutral data structures.
Some services may be implemented that do not follow these strictures for reasons having to do with legacy system implementation or other restriction. These cases are dealt with in usage scenarios later in this document.
"Natural Language" refers to human use of language. Generally systems that are internationalized can produce messages in a variety of different natural languages. These systems are referred to as "localized", because their messages (and frequently behavior) are tailored to the individual cultural expectations for a specific target market or group of individuals.
International preferences are similar to "Natural Language" identification. Some of the other preferences that a service might be interested in include:
Collation (sorting)
Alternate Calendar
Time Zone or offset
Tax Regime
Default Currency
...and many more
Some of these preferences may be inferred from the natural language by converting the natural language preference to the service host's "locale" (Note that the provider agent and service host are not always the same physical host). Other items (such as timezone) are orthogonal or (like collation) imperfectly or incompletely described by a natural language identifier. Separating these values requires forethought in the design of the service and the setting of reasonable default values.
In the sections that follow, you will see the word "locale" used as an adjunct to natural language. A locale is not a language and the language tags discussed in the succeeding sections should not be confused with actually being locale tags or identifiers. However, there is commonly a close relationship between the language identified by such a tag and the corresponding locale in the underlying platform and a software process may choose to use language tags to select many of these additional operational settings or international preferences.
Distributed processing, as with Web services, must allow for several patterns of behavior in the back end implementation represented by the service.
There are three patterns that such a service may follow. These are:
Language/Locale Neutral
Service Determined
Client Influenced
In each of these patterns, the Web service description (commonly WSDL) and actual protocol or invocation (SOAP is used in our examples) should reflect the requirements of the service's own pattern of behavior.
Web service descriptions should consider how to communicate language or locale-of-operation choices in a consistent manner. In the sections that follow, specific patterns are recommended as good canonical references. However experience shows that a specific implementation may require additional contextual information not conveyed with a simple language tag. Generally this type of additional information should be encoded into the data structure defined for actual interchange in the message body (such as a soap:body block), rather than as additional header information as shown in some of the examples below. This is because specific implementation decisions should be expressed as part of the service's signature: you may require additional or different data in future versions.
In the examples below, adoption of a generic method for exchanging "international contextual information" will allow implementations to better model the natural language and locale processing choices offered by the services.
In all cases, the implementer should consider adding a language tag to any operation fault elements to show what language to expect fault messages to be generated in.
In all cases, descriptive text should be tagged with its actual content language using the xml:lang attribute (where permitted). Consideration should be given to providing documentation within services in alternate languages when the service is expected to be used by people in other countries or by speakers of other languages.
In general, SOAP documents should structure data elements in ways that make the most sense for the specific underlying implementation. In the examples given, the user's natural language is passed in an optional header element separate from the specific data structures required to operate the underlying service logic.
This is by design.
Software developers generally get their language resources (translated messages and other locale-specific data) from their programming environment. This functionality is implemented in many ways, but the pattern for writing the logic is always similar: the language and locale preferences are not included in the parameter list of the service itself because the processing environment (JVM, OS, .NET framework, etc.) maintains this information. SOAP Processor implementations should be designed to recognize natural language information passed in the transport (such as HTTP Accept-Language) or in SOAP headers as defined in this document or in the specific implementation-dependent extension of this model and populate or set the appropriate values in the service's environment.
For example, a .NET SOAP Processor might set the service's thread default CultureInfo using the language tag. A J2EE implementation might populate the ServletRequest Locale property with a java.util.Locale constructed from the ISO639 and ISO3166 fields embedded in a language tag. And so forth.
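As a rough illustration of the step described above, the following Python sketch splits a language tag into the ISO 639 and ISO 3166 fields that a SOAP Processor could hand to its runtime environment. The names LocaleHint and locale_hint_from_tag are hypothetical, and the rule that a two-letter second subtag denotes the region is a simplification of RFC3066, not a complete tag parser.

```python
from typing import NamedTuple, Optional


class LocaleHint(NamedTuple):
    """Hypothetical container for the fields a processor might pass on."""
    language: str            # ISO 639 code, lower-cased
    region: Optional[str]    # ISO 3166 code, upper-cased, or None


def locale_hint_from_tag(tag: str) -> LocaleHint:
    """Split a tag such as 'fr-CA' into language and region fields."""
    subtags = tag.strip().split("-")
    language = subtags[0].lower()
    region = None
    # Simplifying assumption: a two-letter second subtag is the region code.
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        region = subtags[1].upper()
    return LocaleHint(language, region)
```

A processor could then use these two fields to select whatever locale object its platform provides; a tag without a region, such as "ja", yields only the language field.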
Fault message "text" elements must be labelled with an appropriate language identifier, as defined in XML 1.0: an xml:lang attribute containing an RFC3066 (or its successor) language identifier. If the transport provides the user's language preference (such as HTTP Accept-Language), then that language or set of languages should be preferred, followed by the SOAP Processor machine's local language preference.
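The preference order described above can be sketched as follows. This is a minimal illustration that assumes the transport is HTTP and that the processor's own preference is a single tag; the function names are hypothetical, and wildcard ranges ("*") receive no special handling.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return the language ranges of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        quality = 1.0  # HTTP default when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def fault_language_order(accept_language: str, processor_default: str) -> List[str]:
    """Transport preferences first, then the processor's own language."""
    order = parse_accept_language(accept_language)
    if processor_default not in order:
        order.append(processor_default)
    return order
```

For a request carrying "da, en-gb;q=0.8, en;q=0.7" and a processor configured for "en-US", the resulting order is da, en-gb, en, en-US.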
Ideally there should always be a "message of last resort" included in the fault. In many cases this message may be in English, but consideration should be given to the likely users of the system, including the administrators trying to puzzle out the error. Numeric (or ASCII-only alpha-numeric) error codes should be considered for inclusion in all fault messages. This may provide valuable reference when the text of the message itself is in a language not understood by the recipient.
When designing specifications intended for interoperability between vendors or implementations, consideration should be given to enumerating the possible faults in advance so that reference numbers can be universally and consistently referenced by disparate implementations.
As noted above, there are three general patterns or policies that may be applied to any specific Web service. These three are:
Language Neutral
Service Determined
Client Influenced
A language neutral service generally is one that executes in the same way regardless of the current runtime locale or user preferences regarding language. This implies that all or most strings embedded in the service are not human readable. This is, by far, the most common pattern. In a language or locale neutral service, external factors do not alter the way the service performs. An example of this would be a service that adds two integers together: 2+2 = 4 in most every locale.
GetTimeService can be written as a locale-neutral service. Time of day, although it is presented with different formats around the world, is measured the same way everywhere and a standardized single frame of reference is available to be used. So a service request could return with a response that included the current UTC (Coordinated Universal Time) time in ISO 8601 format: hh:mm:ss.sss, i.e. as an XML Schema Part 2: Datatypes [XMLS-2] time type.
Any application (i.e. a consumer) that triggers an event causing the GetTimeService request to be sent to a Service Provider, can transform the result into a local or other time format, including perhaps shifting the time into the local time zone. With this proposed implementation, from the requester agent to the service provider, the service and back to the requester agent, the request document and its service implementation are entirely independent of the locale of the client, the host, and the implementer. Hence the name locale-neutral.
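A minimal Python sketch of this division of labour follows. The function names are hypothetical; the provider side serializes only a UTC time of day, and the requester side shifts it to a fixed offset for display. A real requester would more likely use a named time zone and a locale-aware formatter.

```python
from datetime import datetime, timedelta, timezone


def get_time_response(now: datetime) -> str:
    """Provider side: serialize the UTC time of day as hh:mm:ss.sss plus 'Z'.

    `now` must be a timezone-aware datetime.
    """
    utc = now.astimezone(timezone.utc)
    return utc.strftime("%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def to_local_display(value: str, offset_hours: float) -> str:
    """Requester side: shift the UTC value to a local offset and format it."""
    parsed = datetime.strptime(value, "%H:%M:%S.%fZ").replace(
        year=2000, tzinfo=timezone.utc)  # arbitrary date; only the time matters
    local = parsed.astimezone(timezone(timedelta(hours=offset_hours)))
    return local.strftime("%H:%M")
```

The provider never needs to know where the requester is: the same response string can be displayed as 23:30 by a requester at UTC+9 and as 09:30 by one at UTC-5.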
Of course, the request could also be implemented with dependencies on locale of the client or the host and this would complicate implementation and increase the probability of errors in making requests, deploying the service, or implementing the service.
Language neutral services, being the default, do not generally announce themselves in their service contracts, although (other scenarios apply here).
The Web services description does not require any extra information in order to perform its operations or communicate its capabilities. Therefore no additional fields beyond those required by the actual service implementation should be defined in any of the bindings or operations.
The implementer should consider adding a language tag to any operation fault elements to show what language to expect fault messages to be generated in.
Fault messages may be generated in a variety of languages simultaneously. If the transport provides a way of determining the requester agent's preferred language (such as the Accept-Language header in HTTP), then these preferences should be used first. In all cases, a "message of last resort" should be generated, which generally should be in the language and locale of the provider agent or the language most preferred by the administrator of that system (since that is who will have to debug any service requests against the system). Many systems are run in default locales, which often default to the English language. This should be a consideration in configuring the system.
In the Service Determined pattern, the implementation or configuration of the service itself or the provider agent determine some aspects of the processing performed by the service. Generally this can be thought of as a design-time or deployment-time decision, rather than a runtime aspect of the service. That is, the service can be relied on to perform its work in a consistent manner using a predetermined language or locale setting.
A service provider operates in three country markets: U.S., Japan, and Germany. Each market is supported by a local warehouse. One of the services that is provided indicates availability of parts. For a given part number, the following information is returned: part number, quantity, language, description, size, list price.
If all of the information was maintained in a single database one might see entries like:
location | part number | quantity | language | description | size | currency | value |
---|---|---|---|---|---|---|---|
U.S. | 123 | 51 | en-US | 6 pack Budweiser | 12oz. | USD | 5.99 |
Japan | 888 | 10 | en-US | 6 pack Kirin | 591ml | JPY | 999 |
Germany | 500 | 20 | en-US | 6 pack Beck's | 300ml | EUR | 8.00 |
@@@@add some native Japanese and german language entries.
However, this company has not yet implemented a worldwide inventory application and instead each warehouse maintains its inventory in a separate and independent database and uses different applications to manage inventory. In fact, the Japanese warehouse is the only one that uses a system that supports the Japanese language. Because of its support for Japanese and its use of legacy encodings instead of Unicode, the Japanese system cannot properly support German text.
The service provider has implemented a service, GetProductInventory.
The input (inbound message) takes a part number or part ID. The output (outbound message) returns the following information: part no., quantity, language, description, size, list price.
An implementation decision might be taken to provide separate Web services for each warehouse. Customers on each continent will be directed to a particular web service that returns the information from the appropriate warehouse. This pattern is service-determined as it reflects the inherent nature of the service in each part of the world. (That is, the German warehouse supports clients in Germany.)
This example also demonstrates the "service determined" type of limitation: its capabilities are limited by its use of legacy applications. Even if the databases were consolidated and suitable shipping facilities were made available, the service still could not offer product from any warehouse to any client. The Japanese clients using the Japanese service provider would not be able to see all of the potential German beer names. (They could only see them if the names were transliterated to Japanese or a subset of the Japanese legacy encoding, such as 7-bit ASCII.)
There are several sub-patterns for this.
The service may be written to assume a specific locale or language setting or the server which hosts the service or provider agent may be statically configured to use a specific locale. This is generally a poor choice, since it implies a non-internationalized implementation to start with. However, legacy code or other choices may require this pattern and not be in the control of the Web service.
The service may be enabled for or capable of internationalized operation, but it may be configured to process all requests in just a few (or, more likely, just one) language or locale. This allows the service to be deployed to many locales or for it to serve several markets simultaneously, depending on how the service and/or its provider agent is configured.
Service determined language or locale matching is a pattern that is useful when the service's performance is explicitly linked to what it does. For example, a service that returns whether the New York Stock Exchange will be open or closed on a particular day is strongly linked to the holiday schedule in the USA. The service may be written in such a way that it can also be used by other exchanges in other countries, but the specific service might be written simply to support the needs of this one use.
This usage pattern is similar to that of Language Neutral, except that the server's local language preference is expected to affect processing. In most cases the difference between Language Neutral and Service Determined is that the latter returns values that have elements consisting of natural language text or are otherwise locale affected, whereas a Language Neutral service returns values that can be formatted entirely by the requester agent or requester.
The Web services description may provide an optional header field in the operation output element defining the natural language used by the service at execution time. Since this value is at least partially machine dependent, the value probably should not be set in the actual service description. Instead the soap:header or equivalent field should merely be defined as being available. If the service uses an enumerated set of languages, it might be wise to ensure that the actual language used is always passed back in any output.
This Web service pattern is generally similar to that of Language Neutral. Processes that are language or locale affected may retrieve their settings from the default machine settings and the SOAP Processor generally need not provide any special functionality to activate or achieve this.
Faults are generated as with Language Neutral, except that a message in the service-specific language should also be generated, placed after any user-specified language but before that of the SOAP Processor, assuming that these three are all different values and that all are available.
Service "A" is defined on Provider "A", running in an English locale. The service requires that the value of string variable "foo" be less than "zzz".
Requester "A" is running in a French locale and sends a SOAP request with foo="écrit".
Provider "A" returns a fault, even though in French the string is "less than" the evaluated value.
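The mismatch can be reproduced with a short Python sketch. It assumes that the provider compares strings by Unicode code point, which is one plausible reading of the scenario, and it uses accent stripping as a crude stand-in for French collation; neither function is a real collation implementation, and all names are hypothetical.

```python
import unicodedata


def codepoint_less_than(value: str, limit: str) -> bool:
    """Compare by Unicode code point, as a locale-unaware provider might."""
    return value < limit


def accent_insensitive_key(text: str) -> str:
    """Crude approximation of a French sort key: drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).lower()


def approx_french_less_than(value: str, limit: str) -> bool:
    """Compare using the accent-insensitive key instead of raw code points."""
    return accent_insensitive_key(value) < accent_insensitive_key(limit)


foo = "\u00e9crit"  # "écrit" with a precomposed e-acute (U+00E9)
provider_accepts = codepoint_less_than(foo, "zzz")        # False: U+00E9 > 'z'
requester_expects = approx_french_less_than(foo, "zzz")   # True: "ecrit" < "zzz"
```

The requester and the provider apply the same rule ("less than zzz") and reach opposite conclusions, which is why the comparison rules in force belong in the service contract.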
In this pattern, the service will attempt to match its processing to the language specified by the user. The service should therefore provide a way for the user's preference to be communicated, regardless of transport, and the actual value used to perform the processing should be returned in any response that may be generated.
If the service determined example (GetProductInventory) were upgraded so that data was globally accessible, then the pattern could be changed to the client-influenced pattern. In this case, the Web service GetProductInventory, could return information about inventory from all warehouses or any specific warehouse, but would take advantage of the user's language and locale preferences to influence how choices are returned, the ordering of the choices, and even which choices are displayed.
For example, a user that has a Japanese language preference might see only the inventory counts for the products available in the Japanese warehouse, with Japanese language descriptions. Although the locale-influenced web service may return the "best" responses for a majority of users, it may be wrong for some percentage, which on the web may be a large number of users. For example, Japanese-speaking clients in the U.S. would be better served (perhaps) by inventory from the U.S. warehouse.
A more precise web service design would make the selections of warehouse and description language, currency, etc. explicit. However, it is not always possible to present the client all of the information up front for them to make good choices on a form with many explicit options. So client-influenced patterns are often used for this purpose.
In this usage pattern, the service description and actual execution should be designed to allow the requester agent to pass the most appropriate value for the specific invocation of the service.
Both inbound and outbound headers define a value bound to type xsd:language and labelled with an xml:lang attribute. The inbound header should allow multiple values to be passed, in case the first preference is not available. The outbound value should be a single discrete value: that actually used by the service at execution time.
Inbound and outbound headers optionally take the element defined in the WSDL. If not present, the SOAP Processor's local language preference is used. Outbound headers should reflect as accurately as possible the actual value used in performing the processing. This may vary from that specified by the user according to the rules in RFC3066 (or its successor). If no match may be obtained, the results will be implementation defined--the service may generate a fault if processing can go no further, but most implementations will probably use the current provider agent's default locale for the value.
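One way to picture the matching step is the following Python sketch. It assumes the inbound header has already been parsed into an ordered list of language ranges, and it applies the RFC3066 prefix rule (a range matches a tag that is equal to it, or that begins with it followed by "-"). The function names are hypothetical, and falling back to the provider default is only one of the implementation-defined outcomes mentioned above.

```python
from typing import Iterable, List


def matches(language_range: str, tag: str) -> bool:
    """RFC3066-style matching: equal, or the range is a prefix ending at '-'."""
    language_range, tag = language_range.lower(), tag.lower()
    return tag == language_range or tag.startswith(language_range + "-")


def negotiate(preferences: Iterable[str], supported: List[str], default: str) -> str:
    """Return the tag the service will actually use for this invocation."""
    for language_range in preferences:      # requester's order of preference
        for tag in supported:               # languages the service offers
            if matches(language_range, tag):
                return tag
    return default                          # assumed fallback: provider default
```

The returned tag is the single value that the outbound header would carry. Note that a request for "en" selects "en-US", whereas a request for "en-GB" does not, because the prefix rule only works from the shorter range to the longer tag.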
Implementers may also have to define more complex international behavior, beyond that described by a mere language choice. It is common for these design decisions to be specific to the particular application or particular market being serviced. The use of a "locale" or language preference as a short hand for these more complex requirements should be carefully considered, and possibly discouraged, in favor of making the specific information required for proper operation explicit in the service contract.
Nonetheless, in some cases the service implementer may wish to use the language or locale preference of the end user to determine how the service's processing should proceed.
* various examples * ... etc. @@@needs more work here
SOAP Version 1.2 allows fault messages to carry reason texts in multiple languages. SOAP Version 1.2 Part 0: Primer [SOAP-0] explains the <Reason> element as follows: "It must have one or more env:Text sub-elements, each with a unique xml:lang attribute, which allows applications to make the fault reason available in multiple languages. (Applications could negotiate the language of the fault text using a mechanism built using SOAP headers; however this is outside the scope of the SOAP specifications.)"
This mechanism is suitable for returning faults in an environment in which the number of languages is relatively small and the range of languages to be returned is known in advance.
Many actual SOAP implementations are localized into many languages simultaneously. To prevent faults from becoming overly large and difficult to manage, implementations should really include some strategy that reduces the set of languages to a minimum and attempts to match the language of the fault as closely as possible to the client that ends up viewing the message.
Ideal implementations will include mechanisms for "late localization" of the values.
Future versions of SOAP should probably consider allowing additional structured information in a Fault so that suitably internationalized clients can perform the localization and formatting themselves.
The service requester needs to select a matching language from the list of fault reasons returned by the service provider. This scenario illustrates an issue that implementations have to take into account.
The requester may be unable to (validly) match the returned text values to its current (end user) language.
<env:Reason>
  <env:Text xml:lang="en-US">Processing error</env:Text>
  <env:Text xml:lang="cs">Chyba zpracování</env:Text>
</env:Reason>
If the requester prefers "en-GB", then neither string will match directly for the current requester language preference. Although it is apparent to a human that en-US is a reasonable match for en-GB, automated processes are not permitted to make the assumption that languages with common prefixes are mutually understandable. If the requester prefers "ja", then selecting the best fallback is even more difficult.
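The selection problem can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, the envelope namespace is the one used in this document's examples, and the optional primary-subtag fallback is an explicit application policy, not something SOAP or RFC 3066 permits a processor to assume.

```python
import xml.etree.ElementTree as ET

ENV = "http://www.w3.org/2002/06/soap-envelope"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

FAULT = """
<env:Reason xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Text xml:lang="en-US">Processing error</env:Text>
  <env:Text xml:lang="cs">Chyba zpracování</env:Text>
</env:Reason>
"""


def reason_texts(xml_text):
    """Map each xml:lang value to its reason text."""
    root = ET.fromstring(xml_text)
    return {t.get(XML_LANG): t.text for t in root.iter("{%s}Text" % ENV)}


def select_reason(texts, preferred, allow_primary_fallback=False):
    """Return (language, text) for the requester, or None if nothing fits."""
    if preferred in texts:
        return preferred, texts[preferred]
    if allow_primary_fallback:
        # Policy decision made by the application, e.g. "en-GB" accepts "en-US".
        primary = preferred.split("-")[0].lower()
        for lang, text in texts.items():
            if lang.split("-")[0].lower() == primary:
                return lang, text
    return None


texts = reason_texts(FAULT)
```

With exact matching, a requester preferring "en-GB" gets no result; it only receives the "en-US" text if the application opts in to the fallback. A requester preferring "ja" gets no result either way and needs some further default.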
Applies to: SOAP, or an extension of SOAP
Service "A" is defined to receive a message from Requester "A" and deliver to Requester "B" via Service "B". An example of this would be similar to a mail server triggered by a Web service.
Requester "A" calls Service "A".
Service "B" is unable to complete its message transaction and generates a fault.
Service "B" should return a message in a language that matches Requester "B"'s language preference (so that the administrator of that system can use it). In addition, if Requester "A"'s preferences are available to Requester "B" (that is, Service "A" got the preferences as part of its input or via an external mechanism such as the transport), then Requester "A"'s language preference should be included in the SOAP fault reason text.
For example:
Requester "A" has a language preference of "fr-FR"
Requester "B" is running in an environment with a language preference of "de-DE"
Service "B" is running in an environment with a language preference of "en-US"
<env:Reason>
  <env:Text xml:lang="fr-FR">French error</env:Text>
  <env:Text xml:lang="de-DE">Verarbeitungsfehler</env:Text>
  <env:Text xml:lang="en-US">Processing error</env:Text>
</env:Reason>
Some types of internationally sensitive processing cannot be inferred solely from a language identifier. Collation (sorting) is such a process. The collation may even affect the results that one gets from a Web service. For example, if the service selects "all offers > c", in certain locales you might receive back entries starting with "ch" (which is treated as a separate letter). Or if the service returns "all items < z", in certain locales this may not include accented letters such as Å (A-RING).
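The effect of collation on query results can be sketched as follows. The sort key below models a hypothetical locale in which "ch" is a single letter sorting after "c"; it is hand-written for illustration and is not taken from any platform's collation data.

```python
def traditional_ch_key(word):
    """Sort key for a hypothetical locale where "ch" is one letter
    that sorts after "c" and before "d"."""
    key = []
    lowered = word.lower()
    i = 0
    while i < len(lowered):
        if lowered.startswith("ch", i):
            key.append(("c", 1))        # ranks after plain "c" (rank 0)
            i += 2
        else:
            key.append((lowered[i], 0))
            i += 1
    return key


def greater_than(items, bound, key):
    """Return the items that sort after `bound` under the given key."""
    return [item for item in items if key(item) > key(bound)]


offers = ["cactus", "chair", "cotton", "dime"]

by_code_point = greater_than(offers, "cotton", str.lower)
by_locale = greater_than(offers, "cotton", traditional_ch_key)
```

The same query, "all offers after cotton", returns only "dime" under code point order but also returns "chair" under the locale-specific order, so the service and the requester need to agree on which collation applies.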
See S-009 above for a similar example.
Service descriptions are human-readable text intended to describe what the service does and how it should be used. To be useful, the description needs to be a natural language sentence or even a set of keywords in the language that the likely user audience will understand. There should be a way to tag the content with the specific language that it is in and to allow multiple languages. Otherwise false positives or negatives will result.
A sender and a receiver wish to use a Unicode-based encoding to transmit data between each other. This scenario provides an example of best practice.
SOAP is used for XML messages exchanging data among nodes. Because all XML [XML] processors must be able to read entities in both the UTF-8 [RFC2279] and UTF-16 [RFC2781] encodings, using UTF-8 or UTF-16 guarantees character encoding interoperability on the SOAP layer. The Character Model for the World Wide Web [CHARMOD] document describes these considerations and guidelines.
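As a small illustration, the same message can be serialized in UTF-8 and in UTF-16 and read back by an XML processor without any out-of-band agreement. The helper names and the `note` element are hypothetical; the sketch uses Python's standard XML parser and does no escaping, so it is only suitable for the sample text shown.

```python
import xml.etree.ElementTree as ET

ENV = "http://www.w3.org/2002/06/soap-envelope"


def soap_bytes(text, encoding):
    """Serialize a tiny envelope in the given Unicode-based encoding."""
    document = (
        '<?xml version="1.0" encoding="%s"?>'
        '<env:Envelope xmlns:env="%s"><env:Body>'
        '<note>%s</note>'
        '</env:Body></env:Envelope>' % (encoding, ENV, text)
    )
    return document.encode(encoding)


def read_note(data):
    """Parse the bytes and return the text of the note element."""
    root = ET.fromstring(data)
    return root.find("{%s}Body/note" % ENV).text


sample = "Chyba zpracování 平成15年"
utf8_message = soap_bytes(sample, "UTF-8")
utf16_message = soap_bytes(sample, "UTF-16")
```

The two byte sequences differ, but the receiver recovers the same character data from either one.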
Applies to: SOAP, SOAP interface of software systems
A sender and a receiver wish to use a legacy (i.e. not Unicode-based) encoding. This scenario provides an example of good practice, but using a Unicode-based encoding (see the preceding scenario) is preferable, and Web service implementers should avoid legacy encodings wherever possible to ensure interoperability.
This scenario is divided into three aspects.
A sender sends a SOAP message using a legacy encoding.
A receiver receives a SOAP message using a legacy encoding.
Interaction between SOAP layer and SOAP interface such as programming language, middleware, operating systems etc.
When a sender sends a SOAP message in a legacy encoding to a receiver, the receiver may not be able to accept the legacy encoding. In order to use a legacy encoding in a SOAP message, the sender needs to know beforehand whether the receiver is able to accept the legacy encoding in question.
The receiver passes SOAP messages to programming languages, middleware, or operating systems via a SOAP interface. During this interaction, code conversion from a non-Unicode encoding to Unicode may take place.
The XML Japanese Profile [XML-JP] explains that legacy encodings such as Shift_JIS cannot provide complete interoperability in information interchange, because platforms differ slightly in the mapping tables they use for this and similar encodings.
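One such mapping difference can be observed with two codecs that ship with Python, `shift_jis` and `cp932` (the Microsoft variant). The sketch below is only an illustration of the problem; the helper name is hypothetical, and other platforms may show different discrepancies.

```python
# One double-byte character as it might appear in a Shift_JIS message.
WAVE_DASH_BYTES = b"\x81\x60"

# The same bytes decoded with two mapping tables.
jis_char = WAVE_DASH_BYTES.decode("shift_jis")
ms_char = WAVE_DASH_BYTES.decode("cp932")


def can_round_trip(text, codec):
    """True if text survives encoding and decoding with the given codec."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        return False


# A character converted to Unicode on one platform may not convert back
# to the legacy encoding on a platform that uses the other table.
same_character = jis_char == ms_char
survives_on_other_platform = can_round_trip(jis_char, "cp932")
```

Here the two tables yield different Unicode characters for the same bytes, so data that passes through both platforms can be silently altered or rejected.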
Applies to: SOAP, SOAP interface of software systems
Service "A" defines an optional header, as in section 2.3.3, containing a language request field. This service is deployed.
Service "B" defines a service that includes a call to Service "A", but does not define a natural language header.
A requester calls Service "B" with the arguments defined by Service "B". Service "A" therefore receives either no language preference or the preference of Service "B"'s provider, rather than that of the original requester.
Applies to: SOAP, or an extension of SOAP
Service "A" is defined on Provider "A". A fault is generated during invocation that returns a faultReason that includes values for which the presentation depends on locale.
"The date provided, 12 November 2201, was too late."
"The argument 12345.678 was too large."
"The argument 12345,678- was too small."
The provider should format substitutions in each message according to the language (and implied locale) of the message, not according to the locale of the provider or service. In the case of Language Neutral or Service Determined patterns, it may not be possible to generate a message in the user's preferred language (or the preference may not be available). In these cases, the message should follow the language preference of the provider or service host.
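A sketch of this guidance follows. The conventions table, the message templates, and the function names are hypothetical and hand-written for illustration; a real provider would take them from its platform's locale data and message catalogs. Only non-negative values are handled.

```python
from decimal import Decimal

# Hypothetical per-language conventions and message templates.
CONVENTIONS = {
    "en-US": {"group": ",", "decimal": ".",
              "template": "The argument {0} was too large."},
    "de-DE": {"group": ".", "decimal": ",",
              "template": "Das Argument {0} war zu groß."},
}


def format_number(value, language):
    """Format a non-negative Decimal using the language's separators."""
    conv = CONVENTIONS[language]
    whole, _, fraction = str(value).partition(".")
    groups = []
    while len(whole) > 3:
        groups.insert(0, whole[-3:])
        whole = whole[:-3]
    groups.insert(0, whole)
    text = conv["group"].join(groups)
    return text + (conv["decimal"] + fraction if fraction else "")


def fault_texts(value, languages):
    """Build one (language, text) pair per requested language, formatting
    the substitution by the language of the message, not the provider."""
    return [
        (lang, CONVENTIONS[lang]["template"].format(format_number(value, lang)))
        for lang in languages
    ]


texts = fault_texts(Decimal("12345.678"), ["en-US", "de-DE"])
```

Each reason text carries the number in the form its own readers expect, regardless of the locale in which the provider happens to run.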
Applies to: SOAP, or an extension of SOAP
The SOAP message may vary depending on whether the binding is HTTP, FTP, SMTP, RMI, or IIOP, which makes it difficult to deploy a service in a single WSDL.
Service "A" is defined on Provider "A".
The administrator of Provider "A" wishes to deploy the same service on several bindings simultaneously in the same WSDL file.
Some protocols, such as HTTP or, to a lesser extent, SMTP, contain headers or other information that can be used to communicate the requester's language preference. If a service is designed to be wholly or partially sensitive to the requester's language preference, it should include an optional header in the Web service description with the name "lang" which uses the type xsd:language. This value should be used preferentially by the service when deciding on the language preference of the requester. If the value is not supplied or the value is not installed (available) locally, then the provider may also choose to process information provided by the transport protocol. If no values are available or supplied, then the provider may choose some reasonable default to use in the service.
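The order of precedence described above might be sketched as follows, assuming an HTTP binding whose Accept-Language value is available to the provider. The function names are hypothetical, and only exact (case-insensitive) matches against the installed languages are considered.

```python
def parse_accept_language(header_value):
    """Return the language ranges of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header_value.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def requester_language(soap_lang, accept_language, installed, default):
    """Prefer the SOAP "lang" header, then the transport, then a default."""
    candidates = []
    if soap_lang:
        candidates.append(soap_lang)
    if accept_language:
        candidates.extend(parse_accept_language(accept_language))
    installed_lower = {tag.lower(): tag for tag in installed}
    for candidate in candidates:
        if candidate.lower() in installed_lower:
            return installed_lower[candidate.lower()]
    return default


installed = ["en-US", "fr-FR", "ja"]
```

Because the "lang" header is consulted first, the service behaves the same way on bindings that carry no language information at all.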
Service "A" is defined on Provider "A" and invoked by Requester "B".
While processing the request, Provider "A" detects a problem with the header. Provider "A" returns a SOAP fault with Reason text.
Later Requester "B" invokes Service "A" again. This time the actual service generates a fault (and presumably supplies the fault reason text). For example, a Java-based service might throw an Exception to which it passes a string containing information about the fault.
Provider "A" should, if possible, try to resolve any reason Text elements into the languages requested by Requester "B". In some cases this may not be possible because the language in question is not available or the design of the error handling subsystem does not allow multiple language resolution of the fault. In this case, Provider "A" should return the message provided by Service "A" and labeled with the local language (or the language of the actual message, if known).
Caching may affect the results of language negotiation. (This usage scenario is still work in progress.)
Caching is not (yet) well defined or described in Web Services architecture and SOAP 1.2.
If caching is permitted and is separate from service execution and there is no standardized language negotiation mechanism, it is possible that cached responses in the wrong language could be returned.
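One way a cache could avoid this problem is to make the negotiated language part of the cache key. The sketch below is an assumption about how such a cache might look; the class and the sample service are hypothetical, and no caching model is defined by SOAP 1.2 itself.

```python
class LanguageAwareCache:
    """Cache whose key includes the language of the cached response."""

    def __init__(self):
        self._entries = {}

    def get_or_compute(self, operation, arguments, language, compute):
        key = (operation, tuple(arguments), language.lower())
        if key not in self._entries:
            self._entries[key] = compute(language)
        return self._entries[key]


calls = []


def status(language):
    """Stand-in for the real service; records each actual execution."""
    calls.append(language)
    return {"en-us": "Ready", "de-de": "Bereit"}[language.lower()]


cache = LanguageAwareCache()
first = cache.get_or_compute("getStatus", (), "en-US", status)
second = cache.get_or_compute("getStatus", (), "de-DE", status)
third = cache.get_or_compute("getStatus", (), "en-us", status)
```

The German requester never receives the cached English response, and the third call is served from the cache without executing the service again.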
As noted in the Framework section, data structures should be designed wherever possible to be locale neutral. In the case of certain applications this is not possible. In general care should be used to avoid locale-sensitive data representations.
For example, for data transmitted in textual form the following cases can be distinguished because of the actual formatting:
Data formatted using a (locale-)specific convention. Example: A Japanese date such as "平成15年5月16日".
Data formatted in a way that is widely understood by different software. Example: A date in the format used by XML Schema (Part 2, Datatypes)[XMLS-2]: 2003-05-16.
A similar distinction can be made with respect to semantics:
Data that has semantics that depend on culture. For example, the XML Schema datatype 'gMonth' is bound to the Gregorian calendar. A value of this datatype such as '5' (referring to May in each year of the Gregorian calendar) cannot be converted to calendars that do not have their months aligned with the months of the Gregorian calendar.
Data that has semantics independent of culture. Example: the datatype 'integer' from XML Schema (Part 2, Datatypes)[XMLS-2].
Please note that there are formats that use conventions related to some specific culture for the actual representation while their semantics are culture-independent. For example, although XML Schema requires the use of Western digits, which makes the actual formatting culture-dependent, integers can be formatted according to different cultural conventions without problems.
A sender wishes to send data to a receiver. The sender does not know the kind of formatting conventions that the receiver is using for various types of data. This usage scenario does not raise any requirements. It provides an example of best practice.
If the sender does not know the formatting conventions that the receiver uses for a particular type of data, the sender has to use well-known, locale-neutral formatting conventions. XML Schema (Part 2, Datatypes)[XMLS-2] provides such formatting conventions. Although SOAP can support various XML encodings, it specifies common data types in the SOAP encoding, which is intended for general cases such as RPC scenarios that need basic interoperability. SOAP encoding is based on XML Schema.
The following data types are related to date and time. They are built-in primitive types which are specified in XML Schema Part 2.
TYPE             : EXAMPLE
----------------   ---------------------------
date             : 2003-05-31
time             : 13:20:00
tz               : +09:00
dateTime         : 2003-05-31T13:20:00+09:00
double           : 1267.43233E12
decimal:integer  : 2678967543233
The above data types keep interoperability from a lexical analysis perspective. When a SOAP sender sends e.g. a date to a SOAP receiver using the above date type, both sender and receiver can interpret the date lexically.
On the other hand, dates and date-time values are affected by the user's location (time zone). For example, 2003-05-31 in Korea is not exactly the same period of time as 2003-05-31 in France, but it is the same as 2003-05-31 in Japan. Dates vary by time zone, which is determined by regional government, geography, and so on. When processing SOAP encoding or XML Schema data types, it is necessary to distinguish locale-neutral data formatting from locale-neutral data semantics. Using SOAP encoding or XML Schema ensures a locale-neutral data format, but does not always ensure locale-neutral data semantics.
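The difference can be made concrete by computing the interval that a date covers in each place. This sketch assumes fixed offsets for late May 2003 (UTC+9 for Korea and Japan, UTC+2 for France under summer time); the function name is hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def day_interval(day, utc_offset_hours):
    """Return the (start, end) instants of a calendar day at a UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    start = datetime.combine(day, time.min, tzinfo=zone)
    return start, start + timedelta(days=1)


may_31 = date(2003, 5, 31)
korea = day_interval(may_31, 9)
japan = day_interval(may_31, 9)
france = day_interval(may_31, 2)   # assumed summer-time offset

# How much later the same calendar day begins in France than in Korea.
shift = france[0] - korea[0]
```

The lexical value 2003-05-31 is identical in all three cases, yet the Korean and French days are offset by seven hours, while the Korean and Japanese days coincide.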
For example, a Web service that returns whether the New York Stock Exchange is open or closed on a specific day can accept a date like 2002-05-31 as location independent--the exchange is either open or closed that day. However, for example, a service that places a "limit order" (that is, a purchase of stock that has specific limits on it, one of which may be an expiration date if the other limits are not met) might tie the expiration date to the specific time the market closes in New York that day rather than to an arbitrary other timezone.
The following example is a SOAP message in SOAP encoding. Although all of the data uses locale-neutral formatting, the context of the data differs from one location or culture to another.
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
  </env:Header>
  <env:Body>
    <c:getAvailableFlightResponse xmlns:c="http://trip.example/query">
      <c:price>123.55</c:price>
      <c:departuredate>2003-05-31</c:departuredate>
      <c:departuretime>13:30:00</c:departuretime>
    </c:getAvailableFlightResponse>
  </env:Body>
</env:Envelope>
SOAP transactions can use SOAP encoding/XML Schema to assure that there is no ambiguity in transactions between senders and receivers in order to keep lexical analysis interoperability. Additional attributes may be needed to assure the data semantics. Detailed scenarios are described in the next section.
Applies to: Web Services in general
A sender wishes to send monetary data to a receiver. Each of the monetary amounts has a specific currency. This usage scenario does not raise any requirements. It provides an example of best practice.
SOAP messages are sent to a receiver using XML Schema data types such as integer, double, float, date, and time. However, values of these types may carry locale-sensitive semantics such as currency and location. In XML messaging scenarios, attributes are frequently added to supply these semantics.
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
  </env:Header>
  <env:Body>
    <c:getAvailableFlightResponse xmlns:c="http://trip.example/query">
      <c:price c:currency="USD">123.55</c:price>
      <c:departuredate c:location="JFK">2003-05-31</c:departuredate>
      <c:departuretime c:location="JFK">13:30:00</c:departuretime>
    </c:getAvailableFlightResponse>
  </env:Body>
</env:Envelope>
Another option, for an RPC scenario using SOAP encoding, is to use a structure consisting of two elements.
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
  </env:Header>
  <env:Body>
    <c:getAvailableFlightResponse xmlns:c="http://trip.example/query">
      <c:price>
        <c:currency>USD</c:currency>
        <c:amount>123.55</c:amount>
      </c:price>
      <c:departuredate>
        <c:location>JFK</c:location>
        <c:date>2003-05-31</c:date>
      </c:departuredate>
      <c:departuretime>
        <c:location>JFK</c:location>
        <c:time>13:30:00</c:time>
      </c:departuretime>
    </c:getAvailableFlightResponse>
  </env:Body>
</env:Envelope>
Applies to: Web Services in general
A service is designed around a single currency and therefore does not associate financial data with currency symbols or codes. However, the responses from this service may in turn be used by multicurrency services or incorporated into other messages. Therefore the response should be modified to identify the currency used by the service. There are three possible solutions. Two involve modifying the message format. The message can be changed so that a currency element is added to any node that has financial data elements. An alternative is to provide the currency as an attribute for any financial data. However, it may be unacceptable to change the XML message format used by the service. A third alternative is to indicate the currency used throughout the message in the SOAP header.
The following example demonstrates multiple currency data transmission in a SOAP message and the currency code being provided in a separate element along with the value.
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope" >
  <env:Header>
  </env:Header>
  <env:Body>
    <c:purchase>
      <c:apple>
        <c:currency>JPY</c:currency>
        <c:amount>123.55</c:amount>
      </c:apple>
      <c:orange>
        <c:currency>USD</c:currency>
        <c:amount>325.78</c:amount>
      </c:orange>
      <c:peach>
        <c:currency>EUR</c:currency>
        <c:amount>36.55</c:amount>
      </c:peach>
    </c:purchase>
  </env:Body>
</env:Envelope>
The following is an example which has the default currency in the SOAP header. It is also possible to specify the default currency attribute in the SOAP body instead of the SOAP header.
Although adding a parameter to the SOAP body requires a design change to the service interface, adding a default value to the SOAP header does not affect the service interface.
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope" >
  <env:Header>
    <WS-I18N:WSinternationalization xmlns:WS-I18N="http://example.org/2002/11/21/WS-I18N">
      <WS-I18N:Currency>JPY</WS-I18N:Currency>
    </WS-I18N:WSinternationalization>
  </env:Header>
  <env:Body>
    <c:purchase>
      <c:apple>
        <c:price>123.55</c:price>
      </c:apple>
      <c:orange>
        <c:price>325.78</c:price>
      </c:orange>
      <c:peach>
        <c:price>36.55</c:price>
      </c:peach>
    </c:purchase>
  </env:Body>
</env:Envelope>
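A receiver might resolve the currency of each price as sketched below: an explicit currency on an item wins, otherwise the header default applies. This is an illustration under assumptions: the namespace URI bound to the c: prefix is invented for the sketch (the example above leaves it undeclared), mixing a per-item currency element with the header default is hypothetical, and the function name is not part of any specification.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

ENV = "http://www.w3.org/2002/06/soap-envelope"
I18N = "http://example.org/2002/11/21/WS-I18N"
SHOP = "http://example.org/shop"   # assumed namespace for the c: prefix

MESSAGE = """
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope"
              xmlns:c="http://example.org/shop">
  <env:Header>
    <WS-I18N:WSinternationalization
        xmlns:WS-I18N="http://example.org/2002/11/21/WS-I18N">
      <WS-I18N:Currency>JPY</WS-I18N:Currency>
    </WS-I18N:WSinternationalization>
  </env:Header>
  <env:Body>
    <c:purchase>
      <c:apple><c:price>123.55</c:price></c:apple>
      <c:orange><c:currency>USD</c:currency><c:price>325.78</c:price></c:orange>
    </c:purchase>
  </env:Body>
</env:Envelope>
"""


def priced_items(xml_text):
    """Return {item name: (currency, amount)} for every purchased item."""
    root = ET.fromstring(xml_text)
    default = root.findtext(
        "{%s}Header/{%s}WSinternationalization/{%s}Currency" % (ENV, I18N, I18N)
    )
    purchase = root.find("{%s}Body/{%s}purchase" % (ENV, SHOP))
    result = {}
    for item in purchase:
        name = item.tag.split("}")[1]
        # An explicit currency element overrides the header default.
        currency = item.findtext("{%s}currency" % SHOP, default)
        amount = Decimal(item.findtext("{%s}price" % SHOP))
        result[name] = (currency, amount)
    return result


items = priced_items(MESSAGE)
```

Amounts are read as decimals rather than binary floating point so that the monetary values are not altered by parsing.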
Applies to: Web Services General
A sender wishes to send locale dependent data to a receiver for regional SOAP messaging or RPC. The receiver needs to process the locale dependent data correctly.
If a Japanese sender sends date data to a Japanese receiver, the sender may wish to send the date in the Japanese calendar, such as H13-5-31 (H means Heisei era; see Appendix C Heisei).
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
    <WS-I18N:WSinternationalization xmlns:WS-I18N="http://example.org/2002/11/21/WS-I18N">
      <WS-I18N:dataTypePreference>
        <ja:JDate xmlns:ja="http://example.org/2003/12/3/ja">EYY-MM-DD</ja:JDate>
      </WS-I18N:dataTypePreference>
    </WS-I18N:WSinternationalization>
  </env:Header>
  <env:Body>
    <departuredate>H14-5-31</departuredate>
  </env:Body>
</env:Envelope>
The default is locale-neutral mode. If a sender and a receiver can agree, through a handshake, on the same locale semantics, the sender can send locale-dependent data to the receiver, and the receiver can process the data consistently.
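A receiver that has agreed to the EYY-MM-DD form could map such values to the locale-neutral XML Schema date form as sketched below. The function names and the era table are hypothetical; the sketch relies only on the fact that Heisei 1 corresponds to 1989, and it does not validate that a date falls within the era.

```python
from datetime import date

# Gregorian year preceding year 1 of each era (Heisei 1 = 1989).
ERA_BASE_YEAR = {"H": 1988}


def era_date_to_iso(value):
    """Convert an era date such as "H14-5-31" to an ISO 8601 date string."""
    era, rest = value[0], value[1:]
    year, month, day = (int(part) for part in rest.split("-"))
    return date(ERA_BASE_YEAR[era] + year, month, day).isoformat()


def iso_to_era_date(value, era="H"):
    """Convert an ISO 8601 date string to the era form used in the message."""
    parsed = date.fromisoformat(value)
    return "%s%d-%d-%d" % (
        era, parsed.year - ERA_BASE_YEAR[era], parsed.month, parsed.day
    )


neutral = era_date_to_iso("H14-5-31")
```

Converting at the boundary lets the rest of the receiver work with locale-neutral values even when the message itself is locale dependent.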
If WSDL can describe locale-sensitive datatypes, then a locale negotiation mechanism can be described in WSDL. Is this an applicable requirement for the interface definition in WSDL?
Editorial note (KN, 2002-12-10): Possible datatype: Telephone number, ZIP code etc. Needs feasibility assessment.
There is a difference between data types such as telephone number, ZIP code, etc., which can be modeled as strings with patterns, and data types such as dates and numbers, where a connection with the value space (e.g. of XML Schema) may be desirable.
Applies to: WSDL, SOAP, or Localizable datatype
A service is defined using "Service Oriented Architecture Derivative Patterns Intermediary" as found in Web Services Architecture document[WSA].
Service "A" is defined as a service that returns status text and deployed on Providers "A1", "A2", and "A3" in different language configurations (locales).
Provider "B" polls Service "A" on each machine and caches the results.
Service "B" is defined as a process that returns cached results and is deployed on Provider "B".
Requesters "C1", "C2", and "C3", each with different language preferences, invoke Service "B" sporadically to obtain data.
Provider "B" must cache faults and data in all possible languages since it cannot know in advance which requesters will want what data.
Provider "B" must send all data to each requester.
Correlation of data may prove difficult.
Data is formatted for an end user by the receiver according to the end user's preferences and the system conventions. This usage scenario does not raise any requirements. It provides an example of best practice.
The receiver may format data in order to display the data in a user interface. Locale sensitive data formatting functions are widely provided by internationalization functionality of operating systems, programming languages, or applications such as word processors and middleware. Therefore, an application can format locale neutral data using built-in internationalization functions. The details of data formatting vary across different systems. Therefore, Web services themselves do not guarantee completely identical formatting on different systems.
Applies to: Web Services General
Data sent by a sender is formatted by the receiver according to format(s) provided by the sender. This is different from the scenarios "3.4.1 S-021: Data Formatting for End User on Receiver Side" or "3.4.2 S-022: Data Formatting on Sender Side" in that the center of decision and the actual execution of the formatting are separated. Scenarios "3.4.1 S-021: Data Formatting for End User on Receiver Side" or "3.4.2 S-022: Data Formatting on Sender Side" should be preferred because they are more straightforward, but this scenario may be chosen for various reasons such as: The sender wants to ensure consistent appearance, and the data may be used on the receiver side both for formatting and for further processing.
Editorial note (F2F, 2002-11-23): We need more details and better arguments for why the server wants consistency, and maybe examples of what degree of consistency is necessary/desirable in different applications.
Data formatting rules should be sent to receivers together with the data, so that the message is self-describing and the presentation in the user interface stays consistent.
Editorial note (F2F, 2002-11-23): There may be other ways to tell the receiver what exactly to do, e.g. by referencing rather than including formatting rules.
The following is a float value paired with an example numeric formatting rule.
Value: float: 235055.55
Numeric formatting rule: #.##.##0,##
The presentation result for the user interface is:
2.35.055,55
The following example shows a way to send a formatting rule in a SOAP header together with a float value.
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2002/06/soap-envelope">
  <env:Header>
    <WS-I18N:WSinternationalization xmlns:WS-I18N="http://example.org/2002/11/21/WS-I18N">
      <WS-I18N:presentationPreferences>
        <WS-I18N:NumericFormat>#.##.##0,##</WS-I18N:NumericFormat>
      </WS-I18N:presentationPreferences>
    </WS-I18N:WSinternationalization>
  </env:Header>
  <env:Body>
    <value>235055.55</value>
  </env:Body>
</env:Envelope>
Because the receiver receives the value as locale-neutral decimal data together with a formatting rule, the message is self-describing and the receiver can format the data for the user interface.
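How a receiver might apply such a rule is sketched below. The interpretation of the rule is an assumption made for this sketch: "." is the grouping separator, "," is the decimal separator, the last integer group gives the primary group size, the group before it gives the size of all further groups, and the fraction part gives a fixed number of digits. Only non-negative values are handled.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_with_rule(value, rule):
    """Format a locale-neutral decimal string according to a rule
    such as "#.##.##0,##" (interpretation assumed, see above)."""
    int_rule, _, frac_rule = rule.partition(",")
    groups = int_rule.split(".")
    primary = len(groups[-1])
    secondary = len(groups[-2]) if len(groups) > 2 else primary

    # Round to the number of fraction digits named by the rule.
    quantum = Decimal(1).scaleb(-len(frac_rule))
    fixed = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    whole, _, fraction = format(fixed, "f").partition(".")

    if len(groups) == 1:
        grouped = whole                      # rule requests no grouping
    else:
        parts = [whole[-primary:]]
        rest = whole[:-primary]
        while rest:
            parts.insert(0, rest[-secondary:])
            rest = rest[:-secondary]
        grouped = ".".join(parts)
    return grouped + ("," + fraction if fraction else "")


presented = format_with_rule("235055.55", "#.##.##0,##")
```

Under these assumptions the value from the message body is presented as 2.35.055,55, matching the presentation result given earlier, while the transmitted value itself stays locale neutral.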
Applies to: Web Services General
The service provider needs the locale of the sender in order to perform locale sensitive processing. There are three levels to this: Transport layer, service provider layer (SOAP Header), and service layer (SOAP Body) (we need separate scenarios for these)
Note that this layer includes more than just Web services. It also includes XHTML [XHTML], XForms [XFORMS], and other locale-sensitive applications on the Web. The service provider receiving the document needs to know the sender's locale preferences in order to perform locale-sensitive processing. This may include routing based on language or regional preferences, selection of response language, formatting of error messages (if the SOAP document is unintelligible/non-parse-able, what language is the failure message returned in?).
As shown elsewhere, locale-sensitive processing requires a locale to be passed to the Web service somehow. Most Web services, however, are not explicitly locale-sensitive operations. They may still need a locale in their context (for obtaining resources, obtaining the preferred collation, etc.). The Web service author should not have to specify a Locale in the actual service definition, since it has no bearing on the actual data structures involved in the process. For example: xlxlx The Service Provider (SOAP container) also must process the actual SOAP message and generate SOAP fault messages. These may be locale- or language-affected.
The same locale-sensitive processing does not always return the same result, because locale implementations differ among runtime environments.
A sender (requester) wishes a provider (receiver) to return the result of processing under a specific locale implementation, such as a Java locale, a Windows locale, or a POSIX locale.
Because a user can customize a locale, a sender (requester) wishes a provider (receiver) to return the result of locale-sensitive processing based on the sender's customized locale.
Applies to: A locale identification which distinguishes runtime differences or user customization.
Repositories and search-able meta-data (such as UDDI [UDDI]) about Web services need to provide support for multiple language searches. Transport layer issues do not allow XML data structures to be used for resolution, except by reference (e.g. the receiver must down-load a separate resource asynchronously to "decode" the preference). Tags on this layer may be necessary in place of XML data structures.
Editorial note (KN, 2002/12/13): UDDI provides a taxonomy based on the ISO 3166 Geographic Code System.
Using a Web service to obtain some locale-related service, such as formatting, collation etc.
Some locale-related tasks are not easy to describe by an exchangeable data structure. Also, not all systems will have code for all locales. It may therefore be desirable in some cases to 'outsource' locale services.
Example 1: Formatting a date according to traditional Thai conventions (the full text of the formatted date is very long)
Editorial note (F2F, 2002-11-23): Add example.
Example 2: Converting a date according to local rules that may not be predictable.
Example 3: Collation data can be very large, and collation may include additional domain-specific preprocessing in addition to the simple comparison of strings. It may therefore in some cases be more efficient to send the data to be sorted to another system.
Using Web services to update locale-related data that may change dynamically and/or in ways that are not easily predictable.
Some kinds of data needed for culturally sensitive operations are not easily predictable. Web services may be used to update such data, either by polling for updates or by sending out updates.
Example: Certain calendars depend on actually observed events, such as the actual observation of the new moon.
Editorial note (F2F, 2002-11-23): Add more specific details here.
Example: Jurisdictions once in a while may change their rules for which time-zones they are in at which time of the year.
Issues: Scalability; requests may concentrate at specific times; polling may have to be repeated over and over without any actual updates.
If a Web service starts out uninternationalized and is later internationalized, it must be re-deployed as a separate service because the original service contract has changed.
The web services developer is solely responsible for supplying the fields, logic, and semantics that will be used to achieve i18n capabilities. Each service will vary in its approach and may not bother to supply a suitable mechanism. Without guidance from the client, assumptions have to be made that are unsuitable. For example, the locale of the server may be used to format the response. When the service is internationalized, the only option is to ask for locale as an additional input, changing the service contract.
There is an important functional and semantic difference between a field supplied in the actual service invocation (that is, as part of the data) and one supplied in the envelope (that is, as part of the protocol) because when supplied as part of the data, developers must always take care to create, populate, read, and process the fields. Internationalization of an existing service therefore takes the form of deploying a new service (since the inputs have changed).
By contrast, if locale and language preferences are part of the "context" (in the envelope, for example), the developer gains several advantages. First, both the provider and the service can read the locale and language preference. (The service must be provided with a specific API to obtain the locale and language from the provider, or it can be silently managed by the provider.) Services that require external environmental changes to activate their locale-sensitivity can have the provider perform this processing for them. Multiple services in the same "chain" can inherit the same locale and language context. Most important, though, the client-side environment can be optimized to provide the locale and language preferences of the end user automatically, without developers having to write code to obtain the values and populate the inputs of the Web Service. In addition, Web Service authors can add international or multi-language support to services after initial deployment without changing service descriptors (WSDL and XSD) that may already be in wide use.
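The difference can be sketched with a provider that places the language from the envelope into an ambient context. Everything here is hypothetical (the context variable, the dispatch function, the sample service); it only illustrates that the service signature does not change when language support is added.

```python
import contextvars

# Hypothetical provider-managed context; "en-US" stands in for the
# provider's own default when the envelope carries no preference.
_language = contextvars.ContextVar("language", default="en-US")


def current_language():
    """API the provider offers to services for reading the context."""
    return _language.get()


def get_status():
    """Service code: its inputs contain no locale or language field."""
    messages = {"en-US": "Ready", "fr-FR": "Prêt"}
    return messages.get(current_language(), messages["en-US"])


def provider_dispatch(envelope_language, service):
    """Set the context from the envelope, run the service, then restore."""
    token = _language.set(envelope_language) if envelope_language else None
    try:
        return service()
    finally:
        if token is not None:
            _language.reset(token)


without_header = provider_dispatch(None, get_status)
with_header = provider_dispatch("fr-FR", get_status)
```

Requesters that send no header keep working unchanged, while requesters that do send one receive localized results from the same deployed service.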
There is no way for the service to communicate what language and formatting options are available.
If a Web service requires a language, locale, or formatting preference in a service description (WSDL), how can the sender know what values will have meaning at the receiver? For example, POSIX locales for a C program are very different from (say) Windows LCID values.
As a result, developers of internationalized Web services (especially those that support multi-lingual operations - that is, servers that can provide responses in a variety of human languages and dialects) have to give external users, whose platforms and programming languages may be maximally different from their own, a way to discover what options the service supports.